Leveraging augmentation techniques for tasks with unbalancedness within the financial domain: a two-level ensemble approach

Abstract

Modern financial markets produce massive datasets that need to be analysed using new modelling techniques like those from (deep) Machine Learning and Artificial Intelligence. The common goal of these techniques is to forecast the behaviour of the market, which can be translated into various classification tasks, such as predicting the likelihood of companies' bankruptcy or detecting fraud. However, it is often the case that real-world financial data are unbalanced, meaning that the classes are not equally represented in the datasets. This is the main issue, since any Machine Learning model is then trained mainly on the majority class, leading to inaccurate predictions. In this paper, we explore different data augmentation techniques to deal with very unbalanced financial data. We consider a number of publicly available datasets, then apply state-of-the-art augmentation strategies to them, and finally evaluate the results of several Machine Learning models trained on the sampled data. The performance of the various approaches is evaluated according to their accuracy and their micro and macro F1 scores, and by analyzing precision and recall over the minority class. We show that a consistent and accurate improvement is achieved when data augmentation is employed. The obtained classification results look promising and indicate the efficiency of augmentation strategies on financial tasks. On the basis of these results, we present an approach focused on classification tasks within the financial domain that takes a dataset as input, identifies what kind of augmentation technique to use, and then applies an ensemble of all the augmentation techniques of the identified type to the input dataset, along with an ensemble of different methods to tackle the underlying classification.
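The evaluation described in the abstract can be illustrated with a minimal Python sketch. It assumes plain random oversampling as a stand-in augmentation (the abstract does not name the specific strategies used) and a single-label classification setting. The function names `random_oversample`, `precision_recall_f1`, `macro_f1` and `micro_f1` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest one.
    Assumption: a simple stand-in, not one of the paper's augmentation strategies."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall and F1 for one class (for example, the minority class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so the minority class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(precision_recall_f1(y_true, y_pred, l)[2] for l in labels) / len(labels)

def micro_f1(y_true, y_pred):
    """For single-label classification, micro F1 coincides with accuracy."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy unbalanced labels: 8 majority samples (0) and 2 minority samples (1).
y_true = [0] * 8 + [1] * 2
y_majority = [0] * 10  # a model that always predicts the majority class

print(micro_f1(y_true, y_majority))                # looks fine on accuracy alone
print(macro_f1(y_true, y_majority))                # penalised for missing class 1
print(precision_recall_f1(y_true, y_majority, 1))  # minority class is never found

X_bal, y_bal = random_oversample(list(range(10)), y_true)
print(Counter(y_bal))                              # both classes now equally sized
```

The toy run shows why accuracy alone is misleading on unbalanced data: a model that always predicts the majority class still scores well on micro F1, while macro F1 and the minority-class recall expose the failure.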

Similar resources

Leveraging Auxiliary Tasks for Document-Level Cross-Domain Sentiment Classification

In this paper, we study domain adaptation with a state-of-the-art hierarchical neural network for document-level sentiment classification. We first design a new auxiliary task based on sentiment scores of domain-independent words. We then propose two neural network architectures to respectively induce document embeddings and sentence embeddings that work well for different domains. When these d...

A synchronic and diachronic approach to the change route of address terms in the two recent centuries of Persian language

Terms of address, as important linguistic items, provide valuable information about the interlocutors, their relationship and their circumstances. This study investigates the change route of Persian address terms over the two most recent centuries, covering three historical periods: Qajar, Pahlavi, and after the Islamic Revolution. Data were extracted from a corpus consisting of 24 novels w...

Network of phonological rules in the Lori dialect of Andimeshk: a study within the framework of the post-generative approach

The present research offers a description of the sound system of the Lori dialect of the city of Andimeshk, located in the north-west of Khuzestan Province. The theoretical framework of this research is the post-generative autosegmental model. This thesis includes the following: a description of the sounds of this dialect in terms of traditional phonetics and of generative distinctive features, together with a narrow transcription; a description of the sound system of the Lori dialect and its phonological rules within the framework of the post-generative autosegmental model, and an introduction of the interac...

Techniques for augmentation of exogenous DNA uptake by ovine spermatozoa

Sperm mediated gene transfer can be an inexpensive and simple method in animal transgenesis; however its efficiency is poor, mainly due to the spermatozoa’s lesser uptake of exogenous DNA. In the present study, the effects of lipofection and other augmentation techniques, such as sperm freezing and spermatozoa treatment with triton X100 and DMSO, on exogenous DNA uptake by sheep spermatozoa and...

A game theoretical approach for pricing in a two-level supply chain considering advertising and servicing

This paper considers the advertising, pricing, and service decisions simultaneously to coordinate the supply chain with a manufacturer and a retailer. The amount of market demand is influenced by advertising, pricing and service decisions. In this paper, three well-known approaches to the game theory, including the Nash, the Stackelberg-retailer, and the cooperative game are exploited to study ...

Journal

Journal title: EPJ Data Science

Year: 2023

ISSN: 2193-1127

DOI: https://doi.org/10.1140/epjds/s13688-023-00402-9